ABSTRACT

Gene expression profiles can be used to infer previously
unknown transcriptional regulatory interactions
among thousands of genes, via systems biology 
'reverse engineering' approaches. We 'reverse 
engineered' an embryonic stem (ES)-specific tran- 
scriptional network from 171 gene expression 
profiles, measured in ES cells, to identify master 
regulators of gene expression ('hubs'). We dis- 
covered that E130012A19Rik (E13), highly expressed 
in mouse ES cells as compared with differentiated 
cells, was a central 'hub' of the network. We 
demonstrated that E13 is a protein-coding gene 
implicated in regulating the commitment towards 
the different neuronal subtypes and glia cells. The 
overexpression and knock-down of E13 in ES cell 
lines, undergoing differentiation into neurons and 
glia cells, caused a strong up-regulation of the 
glutamatergic neurons marker Vglut2 and a strong 
down-regulation of the GABAergic neurons marker 
GAD65 and of the radial glia marker Blbp. We con- 
firmed E13 expression in the cerebral cortex of adult 
mice and during development. By immuno-based 
affinity purification, we characterized protein 
partners of E13, involved in the Polycomb 
complex. Our results suggest a role of E13 in
regulating the division between glutamatergic projection
neurons and GABAergic interneurons and
glia cells possibly by epigenetic-mediated transcrip- 
tional regulation. 

INTRODUCTION 

Embryonic stem (ES) cells derive from the inner cell mass 
of blastocyst-stage embryos (1,2). The ability of ES cells to
self-renew (3) and differentiate into all three germ layers
both in vitro and in vivo (4,5) has made these cells a
unique in vitro system for studying the molecular mechanisms
that regulate lineage specification. High-throughput
experimental techniques, combined with the use of systems
biology approaches to infer gene regulatory networks 
(reverse engineering), have shown promise in the elucida- 
tion of stem cell renewal and differentiation (6). 

In this work, starting from a collection of ~200 gene 
expression profiles (GEPs) generated in mouse ES cells 
following overexpression of single genes (7), we 'reverse 
engineered' a transcriptional network encompassing 
ES-specific genes to identify master regulators of gene 
expression in ES cells ('hubs'). We discovered that a pre- 
viously uncharacterized gene, E130012A19Rik (E13), 
highly expressed in mouse ES cells as compared with 
differentiated cells, is a central 'hub' of the network. We 
generated E13-overexpressing and E13 knock-down ES
clones. We performed transcriptome analysis of these
clones and demonstrated an enrichment of differentially
expressed genes involved in axon guidance and neuronal
differentiation. By immuno-based affinity purification, we
identified protein-protein interactions of E13 with components
of the Polycomb chromatin remodelling complex
and proteins involved in transcriptional regulation. We
found that ES cells overexpressing E13 and differentiated
into neurons and glial cells show up-regulation of
the glutamatergic neuron marker Vglut2 (8) and down-regulation
of both the γ-aminobutyric acid (GABA)ergic
neuron marker GAD65 (9,10) and the radial glia marker
Blbp (11,12), as compared with wild-type ES clones. We
further demonstrated that E13 is specifically expressed in
the developing and adult cerebral cortex. Taken together,
our results show that E13 has a role in regulating the
commitment towards the different neuronal subtypes
and glia cells.

MATERIALS AND METHODS 

Data analysis of differentially expressed genes in ES cells 
versus differentiated cells 

We compared our collection of 171 ES-specific GEPs 
(GSE19836 and GSE32015) to a collection of 180 GEPs 
derived from normal mouse tissues and differentiated cell 
lines (GSE10246) (13). The two data sets were first
normalized together using the RMA algorithm (14). The
median was chosen as the measure of the expression values for
each probe set within each data set. The variability of the
data was taken into account by dividing this measure by a
pooled variance estimate given by the sum of the median absolute
deviations of the gene expression values in the two data
collections. Each probe set was thus associated with two 
coordinates representing median expression in the 
ES-specific data set and in the differentiated data set, 
and thus represented as a dot in Figure 1. The distance 
from the diagonal was computed, and an empirical 
P-value and a corresponding false discovery rate (FDR) 
were estimated to identify ES-specific transcripts. 
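
The scoring described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the signed-distance convention and the way the null distances are supplied are all hypothetical, since the text does not specify how the empirical P-value was derived.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation of a sequence of numbers."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def scaled_medians(es_values, diff_values):
    """Medians of one probe set in the two collections, each divided by
    the pooled spread (sum of the two median absolute deviations)."""
    pooled = mad(es_values) + mad(diff_values)
    if pooled == 0:
        raise ValueError("pooled spread is zero; probe set is constant")
    x = statistics.median(es_values) / pooled
    y = statistics.median(diff_values) / pooled
    return x, y


def signed_distance_from_diagonal(x, y):
    """Signed distance of the point (x, y) from the line y = x.
    Positive values mean higher expression in ES cells (below the diagonal)."""
    return (x - y) / math.sqrt(2)


def empirical_p_value(observed, null_distances):
    """One-sided empirical P-value: share of null distances >= observed.
    The +1 terms avoid a P-value of exactly zero (an assumption)."""
    exceed = sum(1 for d in null_distances if d >= observed)
    return (exceed + 1) / (len(null_distances) + 1)


# Toy values for a single hypothetical probe set (not real data).
es = [9.1, 9.4, 8.9, 9.3, 9.0]
diff = [4.0, 4.3, 3.8, 4.1, 4.2]
x, y = scaled_medians(es, diff)
distance = signed_distance_from_diagonal(x, y)
```

In this sketch a probe set with a large positive `distance` lies well below the diagonal of Figure 1, i.e. it is more highly expressed in the ES collection than in the differentiated collection.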

Regulatory network inference 

We used ARACNe (15) on 171 microarray experiments 
(GSE19836 and GSE32015) to reconstruct the transcriptional
regulatory network (Supplementary File S1)
in mouse ES cells, following the steps shown in
Supplementary Figure S1. The gene network among the
45 101 transcripts (probe sets) was inferred using
P < 0.001 as the significance threshold for the mutual
information (MI) and setting the data processing inequality
threshold to 0.01. The expression value of each probe set
was averaged across biological replicates before ARACNe 
analysis, and a low-entropy filter was applied to remove 
probe sets whose changes were not significant across the 
data set, thus removing 4511 probe sets. The low-entropy 
filter removes non-informative probe sets by computing 
the entropy of each probe set across the data set as 
described in (16). Probe sets with entropy values less 
than the 10th percentile were removed from further 
analysis. 
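
The low-entropy filter can be illustrated as follows. This is only a sketch: the equal-width histogram estimator, the number of bins and the function names are assumptions (the estimator actually used is the one described in (16)). Dropping the lowest tenth of probe sets, rounded up, is consistent with the 4511 of 45 101 probe sets reported above.

```python
import math


def histogram_entropy(values, n_bins=10):
    """Shannon entropy (bits) of values discretised into equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant profile carries no information
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def low_entropy_filter(profiles, fraction=0.10):
    """Drop the `fraction` of probe sets with the lowest entropy.
    `profiles` maps a probe-set name to its expression values."""
    entropies = {name: histogram_entropy(vals) for name, vals in profiles.items()}
    n_remove = math.ceil(fraction * len(entropies))
    ranked = sorted(entropies, key=entropies.get)  # lowest entropy first
    removed = set(ranked[:n_remove])
    return {name: vals for name, vals in profiles.items() if name not in removed}


# Toy data: nine varying probe sets and one flat probe set (hypothetical names).
profiles = {f"p{i}": [float(i + j) for j in range(8)] for i in range(9)}
profiles["flat"] = [5.0] * 8
kept = low_entropy_filter(profiles)

# Number of probe sets that a 10% cut removes from 45 101 probe sets.
n_removed_full_array = math.ceil(0.10 * 45101)
```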
Figure 1. Scatter plot of the median gene expression level in mouse ES
cells versus differentiated cells. The median level of expression of each
gene in mouse ES cells (x-axis) and in differentiated cells (y-axis) is
represented as a dot. The diagonal line corresponds to the set of genes
with the same expression in ES cells versus differentiated cells. Genes
corresponding to dots significantly off the diagonal (in magenta, genes
with FDR <0.025 and in green FDR <0.05) represent either genes
whose expression is lower in ES cells than in differentiated cells
(above the diagonal), or genes whose expression is higher in ES cells
than in differentiated cells (below the diagonal). Some ES-specific
markers (Oct4, Nanog and Sox2) and the novel ES differentially expressed
gene E130012A19Rik are highlighted.



We validated the inferred network by computing the 
positive predictive value [PPV = TP/(TP + FP)] against 
two different 'Gold Standards' (GS): (i) the Reactome
database, containing experimentally validated interactions
from the literature (Supplementary File S2); and (ii) the
ESCAPE (Embryonic Stem Cell Atlas from Pluripotency
Evidence) database, containing putative transcription
factor (TF)-messenger RNA (mRNA) regulatory interactions
predicted from gene expression profiling in
mouse ES cells (Supplementary File S3). The PPV
represents the percentage of correctly inferred interactions,
i.e. those interactions confirmed by one of the
two GS. To compute the PPV, we first converted transcripts
to genes and then selected only those genes
also present in the 'Gold Standard' (and their inferred
interactions).
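
As an illustration of the validation step, the sketch below restricts a ranked list of inferred interactions to genes present in a reference set and computes PPV = TP/(TP + FP) after each successive prediction. The function names and the toy gene identifiers are hypothetical, and deriving the reference gene list from the reference edges is an assumption made for brevity.

```python
def restrict_to_reference_genes(ranked_edges, reference_edges):
    """Keep only inferred edges whose two genes both occur in the reference."""
    reference_genes = {gene for edge in reference_edges for gene in edge}
    return [e for e in ranked_edges
            if e[0] in reference_genes and e[1] in reference_genes]


def ppv_curve(ranked_edges, reference_edges):
    """PPV after each prediction, with edges ordered by decreasing MI.
    Edges are treated as undirected."""
    reference = {frozenset(e) for e in reference_edges}
    curve = []
    true_positives = 0
    for n_predicted, edge in enumerate(ranked_edges, start=1):
        if frozenset(edge) in reference:
            true_positives += 1
        curve.append(true_positives / n_predicted)  # TP / (TP + FP)
    return curve


# Toy example with made-up gene identifiers (not real interactions).
reference = [("g1", "g2"), ("g2", "g3")]
ranked = [("g2", "g1"), ("g3", "g2"), ("g4", "g5"), ("g1", "g3")]
usable = restrict_to_reference_genes(ranked, reference)
curve = ppv_curve(usable, reference)
```

Here the edge between `g4` and `g5` is discarded before scoring because neither gene appears in the reference, which mirrors the gene-selection step described above.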

The number of predicted interactions in the inferred 
transcript-wise network is 299 610 among 40 590 tran- 
scripts, whereas the gene-wise network has 131 587 inter- 
actions among 17 645 genes. 

The ESCAPE GS and the inferred gene-wise network
have in common 14 151 of 17 645 genes. Among these
14 151 genes, there are 107 663 interactions in the
ESCAPE GS, and 91 925 interactions in the inferred
network. Therefore, the random PPV for the ESCAPE
GS is equal to $107\,663 / [(14\,151^2 - 14\,151)/2] = 0.0011$.

The Reactome GS and the inferred gene-wise network
have in common 3087 of 17 645 genes. Among those
3087 genes, there are 53 933 interactions in the Reactome GS, and 4973
interactions in the inferred network. Therefore, the
random PPV for the Reactome GS is equal to
$53\,933 / [(3087^2 - 3087)/2] = 0.0113$.
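
The two random PPV values can be checked with a short calculation: the number of reference interactions is divided by the number of possible unordered gene pairs, (n^2 - n)/2. The function name is ours; the numbers are those reported in the text.

```python
def random_ppv(n_genes, n_reference_interactions):
    """Expected PPV of a random predictor: reference interactions divided by
    the number of possible unordered gene pairs, (n^2 - n) / 2."""
    possible_pairs = (n_genes ** 2 - n_genes) // 2
    return n_reference_interactions / possible_pairs


escape_random_ppv = random_ppv(14151, 107663)
reactome_random_ppv = random_ppv(3087, 53933)

print(round(escape_random_ppv, 4))    # 0.0011
print(round(reactome_random_ppv, 4))  # 0.0113
```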

We also built a smaller ES-specific subnetwork by 
selecting only the 543 ES-specific genes (FDR < 0.025) 
and the genes they were connected to in the network 
(gene neighbours). To identify ES-specific 'hub genes', 
we first ranked the 5863 probe sets in the ES-specific 
sub-network according to their degree (i.e. the number 
of interactions a probe set has in the network) and 
retained only the top 100 probe sets with the highest 
degree. We then ranked the 5863 probe sets according to 
their ES-specific expression, i.e. with the smallest FDR 
(as detailed in the 'Identification of genes prevalently 
expressed in mouse ES cells' section) and retained only 
the top 100 probe sets with the most specific expression. 
Finally, we intersected
the two lists of probe sets (highest degree versus most
specific expression) to obtain the 14 probe sets in Figure 2.
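
The hub-selection rule (top 100 by degree, top 100 by smallest FDR, then the intersection) can be sketched as below. Function names and the toy values are hypothetical; tie handling in particular is an assumption, since the text does not describe it.

```python
from collections import Counter


def node_degrees(edges):
    """Number of interactions per node in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def top_n(scores, n, largest_first):
    """Names of the n entries with the largest (or smallest) score."""
    ranked = sorted(scores, key=scores.get, reverse=largest_first)
    return set(ranked[:n])


def es_specific_hubs(degree, fdr, n=100):
    """Probe sets that are both among the n best connected and among the
    n most ES-specific (smallest FDR)."""
    most_connected = top_n(degree, n, largest_first=True)
    most_specific = top_n(fdr, n, largest_first=False)
    return most_connected & most_specific


# Toy example with n = 2 instead of 100 (made-up probe-set names and values).
toy_degree = {"a": 50, "b": 40, "c": 3, "d": 2}
toy_fdr = {"a": 0.001, "b": 0.2, "c": 0.002, "d": 0.5}
toy_hubs = es_specific_hubs(toy_degree, toy_fdr, n=2)
```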

Generation of ES clones 

E14Tg2a.4 [E14 (17)] and EBRTcH3 [EB3 (18)] parental 
cell lines were used. The EB3 cell line (18) was obtained 
from the laboratory of Dr Hitoshi Niwa as previously 
described in (7). Mouse ES cells were grown as previously 
described (7). The two E13-inducible cell lines (non-tagged
and 3xFLAG-tagged) were derived from the EB3 cell line.
For the generation of the two exchange vectors (pPTHC-E13
and pPTHC-E13-3xFLAG), we used the vector pPthC-Oct-3/4
obtained from the laboratory of Dr Hitoshi
Niwa (18) and modified it as in (7). The primer pairs used for
the generation of the two E13-inducible cell lines and for
the selection of positive clones, performed as previously
described (7), are reported in Supplementary Table S1.

The knock-down control (shCTL) clones and the E13
knock-down (shE13) clones were derived from the E14 cell
line. For the generation of the pSuper.neo-shE13 and
pSuper.neo-shGFP plasmids, the pSuper.neo vector
(Oligoengine, Seattle, WA, USA) was used. The
knock-down of E13 mRNA was verified by quantitative
real-time polymerase chain reaction (q-PCR) on total
RNA extracted from three shE13 clones (shE13 A7,
shE13 C1 and shE13 C4), as compared with three
shCTL clones (shC B6, shC C1 and shC C3). The
primer pair used for q-PCR was the E13-affy primer
pair reported in Supplementary Table S1. Full details of
the protocol can be found in (7). 

Induction of transgene expression in E13-inducible 
cell lines 

Three inducible non-tagged clones (C1, C3 and C6) and
two inducible 3xFLAG-tagged clones (B5 and B8) were
thawed, amplified and tested for transgene induction as
previously described (7). q-PCR experiments were performed
using a LightCycler 480 II (Roche) for signal detection.
Primers are reported in Supplementary Table S1.

Microarray hybridization and differential gene 
expression analysis 

For the analysis of E13-inducible cell clones, microarray
hybridization experiments were performed on three biological
replicates of non-tagged clones (C1, C3 and C6)
induced in mouse ES media deprived of tetracycline (Tc)
for 48 h. As control, the same clones were grown in mouse
ES media containing 1 µg/ml of Tc. Similarly, for the E13
knock-down cell line, experiments were performed on
three biological replicates for the shE13 clones (shE13
A7, shE13 C1 and shE13 C4) and on two biological replicates
for the shCTL clones (shC C1 and shC C3). Three
micrograms of total RNA from each clone were used and
hybridized to the Affymetrix GeneChip Mouse Genome
430_2 array (Mouse 430_2) using standard protocols.
Differentially expressed genes were detected by a
Bayesian t-test method [Cyber-t (19)] followed by FDR
correction. The thresholds used were FDR < 0.05 for the
induction set and FDR < 0.10 for the knock-down set (different
FDR thresholds were selected to have a comparable
number of false positives in the two data sets).

Figure 2. ES-specific subnetwork inferred from the analysis of 171 GEPs. A reverse engineering algorithm was applied to the set of 171 GEPs from
ES cells comprising >45 000 transcripts. The resulting network was used to obtain an ES-specific subnetwork by selecting only the 543 genes with
ES-specific expression and the genes they were connected to. We then identified 'hub' genes in the subnetwork (numbered from 1 to 14) by ranking
the 543 genes according to the number of connections.
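
To make the role of the two FDR thresholds concrete, the sketch below applies a Benjamini-Hochberg adjustment to a handful of P-values and selects genes at FDR < 0.05 and FDR < 0.10. The text only states that an FDR correction followed the Cyber-t test, so the choice of Benjamini-Hochberg here is an assumption, and the P-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select(adjusted, threshold):
    """Indices of tests whose adjusted P-value is below the threshold."""
    return [i for i, q in enumerate(adjusted) if q < threshold]


# Made-up P-values for four hypothetical genes.
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_q = benjamini_hochberg(toy_p)
strict_hits = select(toy_q, 0.05)   # threshold used for the induction set
relaxed_hits = select(toy_q, 0.10)  # threshold used for the knock-down set
```

With these toy values the stricter threshold keeps one gene and the more permissive one keeps three, which shows how the choice of threshold trades the number of calls against the expected number of false positives.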

For the analysis of the Sugino microarray data set [(20) 
GSE2882], data were normalized using the RMA method 
(14). The normalized microarray data relative to E13 were
extracted and a one-way analysis of variance (21) was 
carried out. The Tukey multiple comparison procedure 
was then performed to identify cell types displaying a stat- 
istical difference in the expression of E13. 

ES cell differentiation protocol and data analysis 

Mouse ES cells were differentiated towards neurons and 
glial cells using the one-step differentiation method (22). 
The differentiation procedure was applied (i) to two E13-inducible
non-tagged clones (C1 and C3) and to two
3xFLAG-tagged clones (B5 and B8); and (ii) to two
shE13 knock-down clones (shE13 A7 and shE13 C4)
and to two shCTL knock-down clones (shC C1 and shC
C3). The morphology of differentiated cells was followed 
using a stereomicroscope (MZ16FA, Leica Microsystems, 
Wetzlar, Germany); images were acquired on a DFC 320 
camera (Leica). For each time course expression profile of 
the selected markers, we used a statistical modelling 
approach based on Gaussian processes (GPs) to identify 
those markers that were significantly affected by E13 
overexpression or knock-down as described in (23). GPs
make it possible to quantify the true signal and noise embedded in a
GEP over time and, moreover, provide a ranking of the
genes according to their differential expression. The
method estimates the continuous trajectory of the gene
expression by means of GP regression. In particular,
given an observed GEP, two hypotheses, H1
and H2, are compared: either the gene is truly differentially
expressed (H1) or the observed profile is just the effect of
random noise (H2). The log-ratio of the marginal likelihoods
(llr) measures which of the two hypotheses is more
likely, with positive values indicating that hypothesis H1 is
more likely, and vice versa for negative values.
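
For reference, the comparison of the two hypotheses can be written out using the standard GP marginal likelihood. The notation is ours and is only a sketch: $\mathbf{y}$ denotes the $n$ observed expression values at time points $\mathbf{t}$, $K_{\theta}$ the covariance matrix (including the noise term) under hyperparameters $\theta$, and $\theta_{H1}$, $\theta_{H2}$ the hyperparameter settings associated with the two hypotheses. The exact parameterization used in (23) may differ.

```latex
\ln p(\mathbf{y} \mid \mathbf{t}, \theta)
  = -\tfrac{1}{2}\,\mathbf{y}^{\top} K_{\theta}^{-1}\,\mathbf{y}
    - \tfrac{1}{2}\,\ln \lvert K_{\theta} \rvert
    - \tfrac{n}{2}\,\ln 2\pi ,
\qquad
\mathrm{llr}
  = \ln p(\mathbf{y} \mid \mathbf{t}, \theta_{H1})
    - \ln p(\mathbf{y} \mid \mathbf{t}, \theta_{H2}) .
```

Under this convention a positive llr favours H1 (a truly differentially expressed profile) and a negative llr favours H2 (noise only), in agreement with the sign rule stated above.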

Western blotting and immunofluorescence analysis 

Fractionated cell lysates from EB3 (CTL) cells and from two
3xFLAG-tagged clones (B5 and B8) induced for 24 and
48 h were prepared. Cytoplasmic (Cyt) and nuclear
(Nuc) fractions were obtained using a standard protocol
(24). Forty micrograms of total protein extracts were
fractionated on an 8% sodium dodecyl sulphate-polyacrylamide
gel electrophoresis (SDS-PAGE) gel,
whereas 15 µg of fractionated protein extracts were
separated on a 10% SDS-PAGE gel. Western blotting
was then performed as previously described (7), and the
following primary antibodies were used: a mouse monoclonal
anti-Flag M2-peroxidase antibody (Sigma) and a
mouse monoclonal anti-β-Tubulin antibody (Sigma).

For immunofluorescence analysis, ES cell clones were 
plated at a density of 1000 cells/cm² on gelatine-coated
24-well plates and induced for 48 h in mouse ES media
and in differentiation medium, respectively, deprived of
Tc. The same clones grown in medium containing
Tc were used as control. The following primary antibodies
were used: an anti-Vglut2 (Abcam, ab79157), an
anti-GAD65 (G1166, Sigma) and an anti-Blbp antibody
(Abcam, ab32423). As secondary antibodies, we used
AlexaFluor594 goat anti-mouse (1:400, Molecular
Probes, Invitrogen) or AlexaFluor594 goat anti-rabbit
(1:400, Molecular Probes, Invitrogen). In all immunofluorescence
analyses performed, 4′,6-diamidino-2-phenylindole
(10 µg/ml, Sigma) in phosphate buffered
saline was used to stain the nucleus. Labelling was
detected by fluorescent illumination using an inverted
microscope (DMIRB, Leica Microsystems, Wetzlar,
Germany); images were acquired on a DC 350 FX camera
(Leica). 

Co-immunoprecipitation (co-IP), mass spectrometry 
and liquid chromatography 

Co-immunoprecipitation (co-IP) was performed using the 
Sigma FLAG Immunoprecipitation Kit (Sigma Aldrich) 
according to the manufacturer's instructions. A total of 
10^7 cells from two inducible 3xFLAG-tagged clones (B5
and B8) and two control clones from the EB3 parental cell line
(EB3/1 and EB3/2) were lysed for the co-IP experiment. A
cell-free negative control and a positive control with 
50 ng of FLAG-tagged bacterial alkaline phosphatase 
(BAP) protein were applied. After the elution of 
immuno-precipitated proteins, the samples were 
fractionated on SDS-PAGE (Bio-Rad mini format) 
followed by Coomassie staining for the visualization of
the quality of the samples. Each gel strip was cut into 20 
equal-sized gel slices and subjected to MS database-based 
protein identification after tryptic digestion. Liquid 
chromatography was performed on an Easy-nLC device 
(Proxeon, Denmark), which is directly coupled to ESI-MS 
analysis. Liquid chromatography (LC)/electrospray 
ionization mass spectrometry (ESI-MS)/MS was performed
on an LCQ Deca XP ion trap instrument
(Thermo Finnigan, Waltham, MA, USA). ESI-MS data 
acquisition was performed throughout the LC run. Three 
scan events, (i) full scan; (ii) zoom scan of most intense ion 
in (i); and (iii) MS/MS scan of most intense ion in (i) were 
applied sequentially. No MS/MS scan was performed on
singly charged ions. The raw data were extracted by
the TurboSEQUEST algorithm; trypsin autolytic fragments
and known keratin peptides were filtered out. All DTA 
files of the same original sample (20 gel slices) generated 
by Sequest were merged and converted to mascot generic 
format files. The mascot generic format files were 
searched using our Mascot Version 2.1 in-house license.
The MS/MS ion searches against Expasy UniProt were
performed with the following parameters: taxonomy
Mus musculus, one missed trypsin cleavage accepted,
monoisotopic mass values, peptide and fragment mass tolerance
±0.8 Da. Oxidation of methionine and acrylamide
adducts on cysteine were set as variable modifications.
Protein hits corresponding to P < 0.05 (Mowse score >28)
were considered as significant protein identifications.
FDRs were estimated based on matches to reversed sequences
in the concatenated target-decoy database. A
maximum FDR of 0.01 at both peptide and protein
levels was tolerated. Protein-protein interaction data
were submitted directly to the Database of Interacting
Proteins (DIP).
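
The target-decoy estimate mentioned above can be illustrated with a short sketch. It assumes the simple estimator FDR = (decoy hits)/(target hits) above a score threshold; other conventions exist (for example doubling the decoy count), and the text does not state which one was used. The scores below are invented.

```python
def count_above(scores, threshold):
    """Number of scores strictly greater than the threshold."""
    return sum(1 for s in scores if s > threshold)


def decoy_fdr(target_scores, decoy_scores, threshold):
    """Estimated FDR: decoy hits divided by target hits above the threshold."""
    targets = count_above(target_scores, threshold)
    decoys = count_above(decoy_scores, threshold)
    return decoys / targets if targets else 0.0


def lowest_threshold_within_fdr(target_scores, decoy_scores, max_fdr=0.01):
    """Smallest observed score that, used as a threshold, keeps at least one
    target hit and an estimated FDR no larger than max_fdr (None if absent)."""
    for threshold in sorted(set(target_scores) | set(decoy_scores)):
        if (count_above(target_scores, threshold) > 0
                and decoy_fdr(target_scores, decoy_scores, threshold) <= max_fdr):
            return threshold
    return None


# Made-up identification scores for target and reversed (decoy) sequences.
toy_targets = [80, 75, 60, 45, 30, 29]
toy_decoys = [31, 20, 10]
fdr_at_28 = decoy_fdr(toy_targets, toy_decoys, 28)
chosen = lowest_threshold_within_fdr(toy_targets, toy_decoys, max_fdr=0.01)
```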

To confirm some of the protein interactions identified 
by MS/MS, we proceeded as follows. Polyclonal rabbit 
IgGs against the following proteins were used: embryonic 
ectoderm development (EED) (Anti-EED, Millipore, 
09-774), suppressor of zeste 12 (Anti-SUZ12 C-terminal 
region, Aviva Systems Biology, ARP39190_P050), 
general TF IIE beta (Anti-GTF2E2, Proteintech, 11596-1-AP).
The FLAG peptide tag (rabbit IgG against FLAG
peptide epitope, Sigma) was used to detect E13 expression
in the E13-inducible ES clones with the 3xFLAG before
and after IP experiment. Forty micrograms of protein 
extracts of the following eight samples were first separated 
by SDS-PAGE using the Laemmli system (Bio-Rad mini 
electrophoresis running chamber, 12% Polyacrylamide, 
gel format: 7.5x8 cm): E13-inducible clone B5 (B5) 
after FLAG-Tag IP, E13-inducible clone B8 (B8) after 
IP, control cell clone Eb3/A2 (K1) after IP, control cell
clone Eb3/B2 (K2) after IP, E13-inducible clone B5 before 
IP, E13-inducible clone B8 before IP, control cell clone 
Eb3/A2 before IP and control cell clone Eb3/B2 before 
IP. For the western blotting against FLAG-tag, a 
49-kDa FLAG-BAP fusion protein from Escherichia coli 
(Sigma-Aldrich, F7425) was used as the positive control 
(PK) for the immunoblotting methodology. Subsequently, 
proteins were transferred onto the supported nitrocellu- 
lose membrane using the standard procedure (semi-dry 
transfer chamber, 40 mA, 1.5 h). The One-Hour Western 
standard kit, including secondary rabbit antibody and 
TMB substrate, (GenScript L00204T, according to the 
manufacturer's instruction) was used for the immunode- 
tection and visualization. 

Immunohistochemistry (IHC) and in situ 
hybridization (ISH) 

Mouse tissue sections were prepared using standard protocols
(25). Antisense RNA probes were labelled using a
DIG-RNA labelling kit (Roche). The following probes
were used: Gad67 and E13 (1100-nt length, containing
the complete E13 coding DNA sequence). RNA in situ
hybridization (ISH) procedures, combined
or not with standard immunohistochemistry (IHC), were
performed as previously described (26). For IHC, the
following primary antibodies were used: anti-Blbp, rabbit
polyclonal antibody (1:100, Abcam); anti-GFAP, rabbit
polyclonal antibody (1:400, Dako); anti-Tbr2, rabbit polyclonal
antibody (1:1000, Chemicon); anti-NeuN, mouse
monoclonal antibody (1:200, Chemicon); anti-Vglut1,
mouse monoclonal antibody (1:100, Millipore);
anti-Calbindin D-28k, rabbit polyclonal antibody
(1:2500, Swant) and anti-TH, rabbit polyclonal antibody
(1:200, Cell Signaling). All experiments in mice were conducted
following guidelines of the Institutional Animal
Care and Use Committee, Cardarelli Hospital (Naples, 
Italy). 

RESULTS 

Identification of genes prevalently expressed 
in mouse ES cells 

In a previous study, we generated a collection of 120 GEPs
measured using microarrays in mouse ES cells by
overexpressing 20 mouse orthologues of human chromosome
21 genes [Gene Expression Omnibus (GEO) accession
number: GSE19836] (7). We have now extended this
collection by overexpressing eight more genes, comprising
mostly TFs (Supplementary Table S2), thus generating
51 additional GEPs (GEO accession number: GSE32015).

We analysed the entire data set of 171 GEPs to identify 
genes whose expression is enriched in mouse ES cells, as 
compared with differentiated cells. To this end, we 
compared our collection of ES-specific GEPs (GSE19836
and GSE32015) with a collection of 180 GEPs derived
from normal mouse tissues and differentiated cell lines
(GSE10246) (13). Both sets were obtained with the same
microarray platform (Affymetrix Mouse 430_2), enabling 
a homogeneous comparison. 

To identify genes that are significantly more expressed 
in ES cells versus differentiated cells, we computed the 
'median' expression level of each probe set in the two
gene expression data sets (ES cells and differentiated
cells); each probe set is shown as a dot in Figure 1. We then
computed the distance from the diagonal (corresponding 
to the set of probe sets with the same expression levels in 
both ES and differentiated cells). We retained for further 
analysis only those probe sets whose distance from the 
diagonal was determined to be statistically significant. 
This approach enabled the identification of 543 genes 
that were significantly more expressed in ES cells versus 
non-ES cells (FDR < 0.025), independently of their 
absolute level of expression ('ES-specific genes'), including
many known 'stemness' genes, such as Oct4
[Pou5f1 (27)] and Nanog (28) (Supplementary Table S3).
The gene ontology enrichment analysis (GOEA) (29,30)
performed on the list of 543 genes confirms that this set
is strongly enriched for stemness-related processes
(Supplementary Table S4), such as stem cell differentiation
(GO:0048863 - p = 9.6e-10), stem cell maintenance
(GO:0019827 - p = 3.8e-7) and stem cell development 
(GO:0048864 - p = 5.4e-7), thus confirming the validity 
of our approach and its potential for the discovery of 
novel genes involved in ES cell fate regulation. 

Identification of a novel mouse gene as central 'hub' of an 
ES-specific gene regulatory network 

'Reverse engineering' approaches make it possible to infer gene-gene
regulatory interactions by computational analysis of
GEPs (31,32). We applied an information-theoretic
approach, named ARACNe (15), which uses a
generalization of the pairwise correlation coefficient,
called MI, to identify co-regulated genes from GEPs.
ARACNe was run on the collection of 171 ES-specific
GEPs. The resulting gene network consists of 299 610 predicted
interactions among 40 590 transcripts. To measure
the reliability of the inferred network, we used two GS 
data sets: (i) a collection of 153 920 putative TF-mRNA 
regulatory interactions obtained from studies of TF loss- 
or gain-of-function followed by mRNA microarray 
profiling in mammalian ES cells (ESCAPE,
http://www.maayanlab.net/ESCAPE/index.php); and (ii) a smaller
set of high-quality experimentally verified functional inter- 
actions found in the Reactome database, an open-source, 
open-access, manually curated and peer-reviewed inter- 
action database (33). We then computed the percentage 
of correctly inferred interactions (PPV) by ranking inter- 
actions by the value of their MI ('Materials and Methods' 
section). As shown in Supplementary Figure S2 for the 
ESCAPE database and Supplementary Figure S3 for the 
Reactome database, the inferred interactions are signifi- 
cantly enriched for functional interactions. 

We then applied a hierarchical clustering algorithm 
based on the Jaccard distance to discover communities 
within the network (34-36). A community is defined as a 
set of genes strongly co-regulated with each other, but 
with few interactions with other genes in the network. 
We thus identified 53 communities (Supplementary 
Table S5). For each community, we performed GOEA 
to identify biological functions significantly enriched 
among genes within the community (Supplementary 
Table S6). 
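
The Jaccard distance underlying the clustering can be sketched as follows. Comparing the sets of network neighbours of two genes is our assumption about what is being compared, and the names are hypothetical; the resulting pairwise distances would then be passed to a hierarchical clustering routine, which is not shown here.

```python
from itertools import combinations


def neighbourhoods(edges):
    """Map each node of an undirected edge list to its set of neighbours."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    return neighbours


def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B|; 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_jaccard(edges):
    """Jaccard distance between the neighbourhoods of every pair of nodes."""
    neighbours = neighbourhoods(edges)
    return {(g, h): jaccard_distance(neighbours[g], neighbours[h])
            for g, h in combinations(sorted(neighbours), 2)}


# Toy network with made-up node names.
toy_edges = [("g1", "a"), ("g1", "b"), ("g1", "c"),
             ("g2", "b"), ("g2", "c"), ("g2", "d")]
toy_distances = pairwise_jaccard(toy_edges)
```

In the toy network `g1` and `g2` share two of their four distinct neighbours, giving a distance of 0.5, whereas `b` and `c` have identical neighbourhoods and a distance of 0.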

To better analyse this large network, we built a smaller 
subnetwork, as shown in Figure 2, by selecting the 543 
ES-enriched genes (FDR < 0.025) described earlier and 
the genes they were connected to in the network (gene 
neighbours). This subnetwork comprises 5863 transcripts 
and 12 944 interactions among them. A graph containing
the complete workflow, data sources and cut-off thresholds
used to obtain the final ES-specific subnetwork is
reported in Supplementary Figure S1.
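
The construction of the subnetwork from seed genes and their neighbours can be sketched as below. The sketch keeps only interactions that touch at least one seed gene; whether interactions between two non-seed neighbours were also retained is not stated in the text, so this choice, like the function name, is an assumption.

```python
def seed_subnetwork(edges, seeds):
    """Edges with at least one endpoint in `seeds`, plus the nodes they span."""
    seeds = set(seeds)
    kept_edges = [(u, v) for u, v in edges if u in seeds or v in seeds]
    nodes = {gene for edge in kept_edges for gene in edge}
    return nodes, kept_edges


# Toy network: s1 and s2 play the role of ES-specific seed genes.
toy_network = [("s1", "n1"), ("s1", "n2"), ("n1", "n2"),
               ("n3", "n4"), ("s2", "n3")]
sub_nodes, sub_edges = seed_subnetwork(toy_network, {"s1", "s2"})
```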

We identified 'hub genes' in this subnetwork, by select- 
ing genes with the highest number of inferred regulatory 
interactions, which were specifically expressed in ES cells 
('Materials and Methods' section). As shown in Figure 2, 
hub genes comprise the sal-like 4 (Sall4) TF, a known
regulator of stem cell pluripotency (37); the genes of the
Dppa family, known to be involved in pluripotency and
stemness (38); the zinc finger protein of the cerebellum 3
(Zic3), which is required for maintenance of pluripotency
in ES cells (39); and the undifferentiated embryonic cell TF
1 [UTF1 (40,41)]. Interestingly, we found that one of the
ES-specific 'hubs' was a gene with unknown function, 
E130012A19Rik (hereafter abbreviated as E13). 

To gain initial insight into its function, we performed
GOEA on the 26 genes connected to E13 in the network
(Supplementary Table S7C). Such a 'guilt-by-association'
approach has already been successfully applied to predict
gene function (42). The results of this analysis suggested a
role of E13 in transcriptional regulation and nucleic acid
metabolism (Supplementary Table S7C). Moreover, E13
belongs to network community 2 (Supplementary Table
S5), which is significantly enriched for biological processes
such as pattern specification process (GO:0007389
- p = 7e-4), axon guidance (GO:0007411 - p = 0.02),
embryonic morphogenesis (GO:0048598 - p = 0.02) and
embryonic development (GO:0009790 - p = 0.03)
(Supplementary Table S6).

We also performed a bioinformatic analysis of the E13 
sequence. This gene is predicted to encode a hypothetical 
protein product (LOC103551, http://www.ncbi.nlm.nih.gov/gene/)
that is highly conserved across vertebrates
(Supplementary Figure S4A), and contains a proline-rich 
domain [PROSITE database (43)] (Supplementary Figure 
S4B). Analysis of the predicted 3D structure suggests that 
part of El 3 protein has similarity to the DNA-directed 
RNA polymerase II. However, it differs from other 
RNA polymerases because of the presence of additional 
peripheral 3D structures not present in normal RNA poly- 
merases. Three putative phosphorylation sites (SI 5, S19 
and S22) were annotated in PhosphoSite database (44) 
(Supplementary Figure S4C). In addition, El 3 may be 
regulated by KLF4, MYC, NANOG, OCT4, REST 
SOX2, TCF3 and TRIM28 according to chromatin 
immunoprecipation experiments reported in (45). 

Generation of E13 transgenic mouse ES cell lines 

We generated three stable mouse ES cell lines: two inducible
cell lines overexpressing E13, which differ only in
the presence in one of them of a 3xFLAG epitope at the C-terminus
of the transgene-coding sequence, and one cell
line in which E13 was stably knocked down
(Supplementary Figure S5).

q-PCR analysis ('Materials and Methods' section) con- 
firmed that the expression of the E13 mRNA was induced 
on the removal of Tc from the medium in both 
the overexpressing clones (Supplementary Figure S5A 
and B). Moreover, we verified the correct induction of
the E13 protein product and determined its intracellular
localization by western blot (Supplementary Figure S6)
with a FLAG-specific monoclonal antibody in the inducible
3xFLAG-tagged cell line ('Materials and Methods'
section). As shown in Supplementary Figure S6A, we
detected an ~43-kDa band corresponding to the
E13-3xFLAG expected protein product (the E13 molecular
weight is about 41 kDa plus 2.4 kDa of the 3xFLAG
peptide), confirming that E13 is a protein-coding gene.
The E13 protein appears to be present in both cytoplasmic
and nuclear fractions (Supplementary Figure S6B).

The knock-down cell line (shE13 clones) was generated
by stably expressing a specific short hairpin RNA
against the E13 sequence, thus knocking down E13 expression
in the parental mouse ES cell line [E14 (17)]. As
control, we selected a short hairpin RNA against the
green fluorescent protein reporter, thus generating
control knock-down clones (shCTL clones) ('Materials
and Methods' section). The inhibition of E13
expression was efficient (~90%) and comparable in the
three different shE13 clones generated (Supplementary
Figure S5C).

Transcriptome analysis of E13 transgenic mouse
ES cell lines

We performed gene expression profiling experiments using
Affymetrix microarrays in both inducible non-tagged
clones (n = 3) and in shE13 clones (n = 3) (GSE31701).
We found that both the overexpression and knock-down
of E13 perturbed the transcriptome in a statistically significant
manner ('Materials and Methods' section). In
Supplementary Table S8 and in Supplementary Table 
S9, we report the complete list of differentially expressed 
genes in the two sets of experiments. 

To obtain a high-confidence set of genes responsive to 
E13, we selected only those genes (n = 221) whose 
expression levels changed consistently (i.e. increased in 
E13-overexpressing clones and decreased in knock-down 
shE13 clones, or vice versa; 'Materials and Methods' 
section and Supplementary Table S10). A GOEA of the 
221 high-confidence genes revealed a significant 
enrichment for biological processes related to axon guidance, 
axonogenesis and other processes involved in neurogenesis 
(Table 1). 
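
The consistency criterion described above can be illustrated with a short sketch. It is only a minimal example under stated assumptions, not the procedure detailed in 'Materials and Methods': the input format (a mapping from gene symbol to log fold change versus control, restricted to genes already called differentially expressed in each experiment), the function name `consistent_genes` and the toy values are hypothetical.

```python
def consistent_genes(oe_logfc, kd_logfc):
    """Keep genes that change in opposite directions in the two experiments.

    oe_logfc / kd_logfc: dict mapping gene symbol -> log fold change versus
    control after overexpression / knock-down (hypothetical input format,
    assumed to contain only differentially expressed genes).
    """
    selected = {}
    # Only genes perturbed in both experiments can be consistent.
    for gene in oe_logfc.keys() & kd_logfc.keys():
        oe, kd = oe_logfc[gene], kd_logfc[gene]
        if oe > 0 and kd < 0:
            selected[gene] = "up_with_E13"    # increased by induction, decreased by knock-down
        elif oe < 0 and kd > 0:
            selected[gene] = "down_with_E13"  # decreased by induction, increased by knock-down
    return selected


# Toy values for illustration only (not data from this study).
oe = {"geneA": 1.2, "geneB": -0.8, "geneC": 0.9, "geneD": 0.5}
kd = {"geneA": -1.0, "geneB": 0.7, "geneC": 0.6}
result = consistent_genes(oe, kd)
```

In this toy case, geneC is discarded because it increases in both experiments, and geneD because it is perturbed in only one of them.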

To better characterize the role of E13, we subdivided 
the list of 221 genes into two sublists: the first list 
(UP) consists of 145 genes up-regulated by the induction 
of E13 and down-regulated by its knock-down 
(Supplementary Table S11); the second list (DOWN) 
includes 76 genes down-regulated by the induction 
of E13 and up-regulated by its knock-down 
(Supplementary Table S12). GOEA revealed that the UP list 
contains genes involved in anatomical structure 
development, axon guidance and other related processes, which 
suggests a possible involvement of E13 in neurogenesis 
(Supplementary Table S13). On the other hand, the 
DOWN list consists of genes involved in the regulation 
of biosynthetic processes and regulation of gene expression, 
suggesting a role of E13 in global transcriptional 
regulation (Supplementary Table S14), and confirming the 
results obtained by the 'guilty-by-association' approach 
on the ES network. 

We also explored the network of the genes surrounding 
E13, i.e. all the nodes at a distance of 2 from E13 (Figure 3), 
and obtained a subnetwork composed of 106 genes and 
128 connections. We found that 63 of the 106 genes were 
indeed perturbed in their expression level by E13 
(Supplementary Table S15). 

Table 1. Significantly enriched GOEA terms for the 221 
high-confidence genes 

Gene ontology term                                      FDR     Fold enrichment   P-value 
Axon guidance                                           0.052   7.5               3.2E-4 
Axonogenesis                                            0.069   4.5               4.5E-3 
Anatomical structure development                        0.086   1.6               5.6E-3 
Neuron projection morphogenesis                         0.098   4.2               6.4E-3 
Cell morphogenesis involved in neuron differentiation   0.11    4.1               7.5E-3 

Significant GOEA terms, the FDR, the fold enrichment and the 
P-values for each term. GOEA was performed with the DAVID 
online tool restricting the output to all biological process terms with 
a significance threshold of FDR <0.1 and fold enrichment >1.5. 
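
A subnetwork such as the one surrounding E13 (Figure 3) can be extracted from an inferred network with a breadth-first search, as in the following minimal sketch. It assumes that the subnetwork contains every node reachable from E13 in at most two edges and that all inferred connections among the retained nodes are kept; the function name `neighbourhood`, the edge-list input and the toy gene names are hypothetical.

```python
from collections import deque


def neighbourhood(edges, source, max_distance=2):
    """Return (nodes, edges) of the subnetwork within max_distance of source.

    edges: iterable of (gene_a, gene_b) pairs of an undirected network
    (hypothetical input format).
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Breadth-first search, recording the distance of each node from source.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand beyond the requested radius
        for nxt in adjacency.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    nodes = set(distance)
    # Keep every connection whose two ends were both retained.
    sub_edges = {frozenset((a, b)) for a, b in edges if a in nodes and b in nodes}
    return nodes, sub_edges


# Toy network for illustration only.
toy_edges = [("E13", "g1"), ("g1", "g2"), ("g2", "g3"), ("E13", "g4"), ("g3", "g5")]
nodes, sub_edges = neighbourhood(toy_edges, "E13")
```

Here g3 and g5 lie three and four edges away from E13 and are therefore excluded, together with the connections that involve them.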

Identification of protein interaction partners of E13 

We performed immuno-based affinity purification 
experiments followed by mass spectrometric protein 
identification (46) using the E13-inducible ES clones with the 
3xFLAG tag ('Materials and Methods' section). We identified 
23 potential protein partners of E13 with high confidence 
(Table 2), among which there are two TFs [Gtf2e2 and Btf3 
(47)], several mRNA processing proteins, two components 
of the Polycomb complex [Eed and Suz12 (48)] and the 
retinoblastoma binding protein Rbbp4, which may be 
involved in pluripotent stem cell maintenance and 
neuronal differentiation (49). Twelve of the 23 transcripts 
corresponding to this subset of proteins were also 
differentially expressed following E13 overexpression 
(FDR < 0.1) (Supplementary Table S16). 

To further confirm these protein interactions, we 
selected 3 of the 23 proteins, taking into account the 
availability of antibodies and their biological functions: Suz12 
and EED (part of the Polycomb complex) and the general 
TF GTF2E2. We then immunoprecipitated E13 using the 
anti-3xFLAG antibody in E13-overexpressing clones, 
followed by western blot analysis with antibodies 
against the three selected proteins (the experiment was 
performed in duplicate) ('Materials and Methods' section). 
For all three proteins, we were able to confirm the 
interaction with E13, as shown in Supplementary Figure S7. 

The interaction of E13 with these proteins further 
suggests a role of E13 in mRNA processing and epigenetic 
regulation (Supplementary Table S17). 

E13 is a modulator of neuronal differentiation 

Results of the bioinformatic analysis and of the 
transcriptomics and proteomics experiments suggested a 
role of E13 in regulating gene expression and 
neurogenesis. To confirm a role of E13 in this process, we 
specifically differentiated both E13-inducible and knock-down ES 
clones into neurons and glia cells using the one-step 
differentiation method (22). 

We first verified, by western blot analysis, the expression 
of the E13 protein at Days 10 and 15 of the differentiation 
protocol (Supplementary Figure S8). We then verified by 
q-PCR the potential of the inducible and of the 
knock-down E13 clones to differentiate along the 
neuronal and glial cell lineages. For this purpose, we 
collected RNA samples at different time points and analysed 
the expression profiles of E13, of a marker of 
undifferentiated ES cells [Oct4 (27)] and of markers of neuronal 
precursors [Nestin (50) and Neurog1 (51)]. 
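
The legends of Figures 4 and 5 report relative expression as 2^-dCT. The sketch below shows that calculation under the usual convention, which is assumed here rather than taken from the text: dCT is the threshold cycle (Ct) of the transcript of interest minus the Ct of a reference transcript measured in the same sample. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression as 2^-dCT, with dCT = Ct(target) - Ct(reference)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: the target crosses the detection threshold
# three cycles later than the reference transcript.
value = relative_expression(ct_target=25.0, ct_reference=22.0)
```

With these values, dCT is 3 and the relative expression is 2^-3 = 0.125; each additional cycle of delay halves the reported level.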

Figure 4 shows the expression profiles of E13 
(Figure 4A), Oct4 (Figure 4B), Nestin (Figure 4C) and 
Neurog1 (Figure 4D) in the inducible clones (left panels) 
and in the knock-down clones (right panels). All of the 
clones displayed the expected down-regulation of the 
pluripotency gene Oct4. We observed that during 
differentiation the expression profiles of the neuronal precursor 
markers Nestin and Neurog1 were significantly, albeit 
transiently, affected by E13 knock-down. 

Figure 3. Subnetwork of genes surrounding E13. The network comprises all of the nodes at a distance of 2 from E13. It consists of 106 genes and 128 
connections. Nodes are coloured according to their differential expression (squares if significant, circles if not) following E13 overexpression (inner 
square/circle) or knock-down (outer square/circle). 

To assess whether the induction and/or the knock-down 
of E13 could influence the formation of radial glia cells 
and/or of some neuronal subtypes, we then analysed the 
expression of additional neuronal lineage-specific markers 
(Figure 5). Vglut2, the vesicular glutamate transporter 2, 
which is essential for glutamate release from presynaptic 
vesicles in glutamatergic excitatory neurons (8), was 
specifically up-regulated by E13 overexpression in the 
inducible clones (Figure 5A). In contrast, both GAD65, a 
glutamate decarboxylase specific for GABAergic neurons 
(10), and Blbp, the brain-specific member of the 
lipid-binding protein family, which is required for the 
establishment of the radial glial fibre system in the developing 
brain (11,12), were significantly down-regulated 
(Figure 5B and C). 

In addition, we analysed the expression of Chat 
(choline acetyltransferase), a marker of the mammalian 
cholinergic system (52), as well as that of the rate-limiting 
enzyme in dopamine biosynthesis, tyrosine 
hydroxylase (TH) (53), and of the serotonin biosynthetic 
enzyme tryptophan hydroxylase (Tph2) (54). The expression 
of Chat was significantly, albeit transiently, affected by the 
knock-down of E13, whereas the expression of TH and of 
Tph2 did not change significantly following E13 induction 
or knock-down (Supplementary Figure S9). 

To verify whether the aforementioned variations in the 
expression of the glutamatergic, GABAergic and radial 
glial markers also resulted in a variation in the number 
of cells belonging to those specific subpopulations, we 
performed immunofluorescence experiments with 
anti-Vglut2, anti-GAD65 and anti-Blbp antibodies during 
differentiation of E13-inducible clones and knock-down 
clones (Supplementary Figure S10). We found that the 
number of Blbp-positive cells and of GAD65-positive 
cells in induced clones was significantly lower (negative 
binomial P = 0.02 and P = 0.009, respectively) than in 
uninduced control clones (Supplementary Figure S10B). 
These data suggest that the down-regulation of the 
GAD65 and Blbp transcripts (Figure 5B and C) following 
E13 overexpression is due to the reduced number of 
GAD65-positive and Blbp-positive cells. 

Expression of E13 in brain 

The aforementioned results prompted us to study the 
expression pattern of E13 in the central nervous system. 
Towards this goal, we performed RNA ISH analysis with 
an E13 probe in mouse brain starting at embryonic Day 
12.0 (E12.0) (Figures 6 and 7). At this stage, the first 
post-mitotic excitatory neurons are clearly visible at the 
pial border of the dorsal pallium (55,56), while the 
interneuronal population emerging in the ganglionic eminences 
of the ventral pallium is approaching its tangential 
migration (57-59). 

Table 2. Proteins identified from the co-IP experiment as potential partners of E13 

UniProt ID   Gene symbol    Protein name                                                  Mowse score  Molecular weight (Da)  Sample name* 
BTF3_MOUSE   Btf3           [illegible]                                                   47           22 017                 IC 
C1QBP_MOUSE  C1qbp          Complement component 1 Q subcomponent-binding protein         [illegible]  [illegible]            IC 
CPSF6_MOUSE  Cpsf6          Cleavage and polyadenylation specificity factor subunit 6     31           59 116                 IC 
CQ096_MOUSE  E130012A19Rik  Uncharacterized protein C17orf96 homologue                    103          38 071                 IC 
DDX5_MOUSE   Ddx5           Probable ATP-dependent RNA helicase DDX5                      57           69 277                 IC 
EED_MOUSE    Eed            Polycomb protein EED                                          [illegible]  [illegible]            IC 
H12_MOUSE    Hist1h1c       Histone H1.2                                                  [illegible]  21 254                 IC 
HS90A_MOUSE  Hsp90aa1       Heat shock protein HSP 90-alpha                               35           84 735                 IC 
IGH1M_MOUSE  Ighg1          Ig gamma-1 chain C region, membrane-bound form                [illegible]  43 359                 IC 
IGKC_MOUSE   Igkc           Ig kappa chain C region                                       [illegible]  11 771                 IC 
INT3_MOUSE   Ints3          Integrator complex subunit 3                                  28           117 862                IC 
KPRA_MOUSE   Prpsap1        Phosphoribosyl pyrophosphate synthetase-associated protein 1  142          39 407                 IC 
KV2A7_MOUSE  LOC636468      Ig kappa chain V-II region 26-10                              [illegible]  12 265                 IC 
ML12B_MOUSE  Myl12b         Myosin regulatory light chain 12B                             116          19 767                 IC 
MTF2_MOUSE   Mtf2           Metal-response element-binding TF 2                           42           66 882                 IC 
MYH9_MOUSE   Myh9           Myosin-9                                                      [illegible]  [illegible]            IC 
NEK9_MOUSE   Nek9           Serine/threonine protein kinase Nek9                          32           107 015                IC 
NKTR_MOUSE   Nktr           NK-tumour recognition protein                                 [illegible]  163 341                IC 
PCNA_MOUSE   Pcna           Proliferating cell nuclear antigen                            72           28 766                 IC 
RBBP4_MOUSE  Rbbp4          Histone-binding protein RBBP4                                 [illegible]  [illegible]            IC 
RL23A_MOUSE  Rpl23a         60S ribosomal protein L23a                                    240          17 684                 IC 
RL28_MOUSE   Rpl28          60S ribosomal protein L28                                     98           15 724                 IC 
RS11_MOUSE   Rps11          40S ribosomal protein S11                                     [illegible]  18 419                 IC 
RS13_MOUSE   Rps13          40S ribosomal protein S13                                     105          17 212                 IC 
SETX_MOUSE   Setx           Probable helicase senataxin                                   [illegible]  [illegible]            IC 
SPTA2_MOUSE  Sptan1         Spectrin alpha chain, brain                                   658          284 422                IC 
SPTB2_MOUSE  Sptbn1         Spectrin beta chain, brain 1                                  607          [illegible]            IC 
SUZ12_MOUSE  Suz12          Polycomb protein SUZ12                                        78           82 974                 IC 
T2EB_MOUSE   Gtf2e2         General TF IIE subunit 2                                      42           33 026                 IC 
THOC4_MOUSE  Thoc4          THO complex subunit 4                                         [illegible]  26 924                 IC 
UBIQ_MOUSE   Rps27a         Ubiquitin                                                     [illegible]  8560                   IC 
ACTB_MOUSE   Actb           Actin, cytoplasmic 1                                          947          41 710                 CTL 
HNRPF_MOUSE  Hnrnpf         Heterogeneous nuclear ribonucleoprotein F                     170          45 701                 CTL 
IGH1M_MOUSE  Ighg1          Ig gamma-1 chain C region, membrane-bound form                699          43 359                 CTL 
IGKC_MOUSE   Igkc           Ig kappa chain C region                                       [illegible]  11 771                 CTL 
KV2A7_MOUSE  LOC636468      Ig kappa chain V-II region 26-10                              796          12 265                 CTL 
MYH10_MOUSE  Myh10          Myosin-10                                                     1883         228 855                CTL 
PRPS1_MOUSE  Prps1          Ribose-phosphate pyrophosphokinase 1                          123          34 826                 CTL 
RS5_MOUSE    Rps5           40S ribosomal protein S5                                      84           22 875                 CTL 
IGH1M_MOUSE  Ighg1          Ig gamma-1 chain C region, membrane-bound form                699          43 359                 P-CTL 
IGKC_MOUSE   Igkc           Ig kappa chain C region                                       310          11 771                 P-CTL 
KV2A7_MOUSE  LOC636468      Ig kappa chain V-II region 26-10                              368          12 265                 P-CTL 
PPB_ECOLI    phoA           Alkaline phosphatase OS = E. coli (strain K12)                6665         49 408                 P-CTL 
IGH1M_MOUSE  Ighg1          Ig gamma-1 chain C region, membrane-bound form                625          43 359                 N-CTL 
IGKC_MOUSE   Igkc           Ig kappa chain C region                                       735          11 771                 N-CTL 
KV2A7_MOUSE  LOC636468      Ig kappa chain V-II region 26-10                              535          12 265                 N-CTL 

IC = inducible cell line; CTL = control cell line; P-CTL = positive control with FLAG-tagged BAP protein; N-CTL = negative control without protein. 
Two inducible cell samples, two control cell samples, one negative control and one positive control were processed simultaneously. 
Protein hits corresponding to P < 0.05 (Mowse score >28) were considered as significant protein identifications. 

E13 mRNA was strongly expressed in the developing 
cerebral cortex, from which glutamatergic neurons and 
glial cells derive (Figure 6A). At higher magnification, a 
graded expression of E13 in the developing cortex 
(Figure 6B) is clearly visible, from the ventricular (low) 
to the pial (high) surface. By double ISH/IHC, E13 
appears to be strongly expressed in the post-mitotic 
neuronal population of the pre-plate, as marked by the 
neuronal nuclei staining [NeuN, Rbfox3 (60)] 
(Figure 6B). E13 levels gradually increase in the 
subventricular zone, co-localizing with the intermediate 
progenitor cells, which are positive for the neuronal marker Tbr2 
(Figure 6B) (61-63). 

E13 expression in adult brain was localized to cortical 
glutamatergic neurons by RNA ISH in P21 mouse brain, 
and appeared to be particularly high in the cerebral cortex, 
especially in cingulate (not shown), somatosensory and 
piriform cortex, in the hippocampal formation and in 
the amygdaloid nuclei (Figure 7A). A detailed analysis 
of E13 in the cortical domains indicated a large 
representation in all the cortical layers, showing high 
co-localization with the vesicular glutamate transporter 
1, Vglut1, a marker of adult glutamatergic cells, whereas 
no co-localization is apparent in Blbp-positive (11,12) or 
GFAP-positive mature astroglial cells in the cortex 
(64,65) (Figure 7B). 

Figure 4. Expression profile of E13, Oct4, Nestin and Neurog1 during neuronal differentiation. The expression profiles of E13 (A), Oct4 (B), Nestin 
(C) and Neurog1 (D) transcripts were evaluated by q-PCR in two inducible non-tagged clones derived from parental ES cell line EB3 (left panels) and 
in two knock-down clones derived from parental ES cell line E14 (right panels). For the E13 transcript, the E13-3xFLAG primer pair and the E13-affy primer 
pair were used in inducible and in knock-down clones, respectively. The red bar represents the control, the green bar represents the expression profile 
of each transgene in the inducible and in the knock-down clones, respectively. For each expression graph, we reported on the x-axis the days of the 
differentiation protocol and on the y-axis the relative expression of the transcript expressed as 2^-dCT. For each expression profile, the llr was 
calculated, and the value was reported on the top of each graph. The llr represents the statistical significance value of the difference in the expression 
profiles of each transgene in the inducible and in the knock-down condition compared with each control. The statistical significance is indicated by 
an asterisk (llr > 0). 

Figure 5. Expression profile of Vglut2, GAD65 and BLBP during neuronal differentiation. Expression analysis in two inducible non-tagged clones 
derived from parental ES cell line EB3 (left panels) and in two knock-down clones derived from parental ES cell line E14 (right panels) was evaluated 
by q-PCR: Vglut2 is a glutamatergic neuron marker (A), GAD65 is a GABAergic neuron marker (B), BLBP is a radial glia marker (C). The red bar 
represents the control, the green bar represents the expression profile of each transcript in the overexpressing and in the knock-down clones. For each 
expression graph, we reported on the x-axis the days of the differentiation protocol and on the y-axis the relative expression of the transcript 
expressed as 2^-dCT. For each expression profile, the llr was calculated, and the value was reported on the top of each graph. The llr represents the 
statistical significance value of the difference in the expression profiles of each transgene in the inducible and in the knock-down condition compared 
with each control. The statistical significance is indicated by an asterisk (llr > 0). 

E13 is also highly represented in the hippocampal 
formation, in both the glutamatergic principal neuronal 
populations: the pyramidal neurons of the Cornu 
Ammonis 1 (CA1) to CA3 fields of the hippocampus 
proper and the granule cells of the dentate gyrus (DG) 
(Figure 7C and D). Interestingly, in the DG, one of the 
two brain niches where neurogenesis occurs throughout 
adult life (66-70), E13 shows a gradient similar to that 
observed in the developing cortex, with high levels in the 
mature granule cells, as confirmed by co-localization with 
the calcium binding protein Calbindin (Figure 7C) (71). 

Figure 6. Expression of E13 in developing mouse brain. (A) Expression of E13 during mouse brain development evaluated by RNA ISH at 
embryonic Day 12.0 (E12.0). High levels of E13 are detected in the dorsal pallium from the ventricular to the pial surface, along the entire 
rostro-caudal (R-C) axis. In the ventral pallium, E13 expression is restricted to the ventricular zone of the ganglionic eminences (dashed lines). 
Scale bar: 500 µm. (B) High magnification of E12.0 embryonic brain at the rostro-caudal level indicated by the asterisk in (A). A graded expression of 
E13 is clearly visible in all the cortical primordium. Double ISH/IHC shows that, although at low levels of expression, E13 levels gradually increase 
and co-localize with the Tbr2+ intermediate progenitor cells of the subventricular zone, which are about to undergo the final neurogenic division. 
The highest level of E13 is evident in the post-mitotic neurons of the preplate that are positive for NeuN staining (arrows and arrowheads in the 
higher magnification and insets). Please note that in the ganglionic eminence, E13 is detected only in the ventricular zone. Scale bars: 100 µm, 50 µm 
in high magnifications. ctx, cortex; hp, hippocampus; vz, ventricular zone; svz, subventricular zone; ppl, preplate; LGE, lateral ganglionic eminence; 
MGE, medial ganglionic eminence; CGE, caudal ganglionic eminence; mz, mantle zone; Th, thalamus; Hyp, hypothalamus. 



To evaluate the expression of E13 in neural populations 
whose transcriptome profiles, in vitro, do not appear to be 
influenced by E13 manipulation, we examined the 
TH-positive population that corresponds to dopaminergic 
neurons (53). We followed this population from 
embryonic brain development to adulthood, 
and we found no co-localization with E13 either at 
E12.0 (Supplementary Figure S11A) or at P21 
(Supplementary Figure S11B). 

To further assess in which neuronal subtypes the E13 gene 
is significantly expressed, we analysed 36 GEPs collected 
from 12 different neuronal cell types in the adult mouse 
forebrain (GSE2882) (72). As shown in Supplementary 
Figure S12, the expression of E13 varies significantly 
across the 12 different neuronal cell types (analysis of 
variance P = 3 × 10^-6), with higher expression in the five 
glutamatergic pyramidal neuron populations, as compared 
with the six GABAergic populations of interneurons. 
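
The comparison across cell types relies on a one-way analysis of variance. As a minimal sketch of the statistic behind such a test, the F ratio can be computed as shown below. The expression values are invented for illustration and are not data from GSE2882, and the function name is hypothetical. If all 36 profiles in 12 groups enter the test, the degrees of freedom would be 11 and 24; converting F into a P-value additionally requires the F distribution, which is not implemented here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    groups: list of lists, one list of expression values per cell type.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the single values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented values: three cell types with three replicate profiles each.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy)
```

A large F means that the differences between cell-type means are large relative to the spread of replicates within each cell type.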



DISCUSSION 

Regulatory interactions among genes can be 'reverse en- 
gineered' by considering pairs of genes and checking 
whether they are co-expressed across different experimen- 
tal conditions ('co-expression' networks). Reverse engin- 
eering is a powerful tool to generate hypotheses on gene 
function (73). In this work, we have produced a collection 
of GEPs measured in mouse ES cells from our previous 
study (7) together with a new collection of microarrays 
and used a systems biology reverse engineering approach 
to gain initial insight into the functional role of a previ- 
ously uncharacterized gene, E130012A19Rik. We applied 
the ARACNe reverse engineering algorithm, which is 
based on computing a pairwise MI between two genes, 
as the nature of the GEPs we collected prevented use of 
more sophisticated strategies (such as those based on time 
series GEPs). We note that a plethora of 
reverse engineering algorithms is available, and new and 
improved methods are being developed (32). Therefore, 
different reverse engineering methods could reveal 
different aspects of the network, and thus either confirm or 
reveal additional roles of E13, as well as of other genes 
with unknown functions. 

Figure 7. E13 is expressed in cortical glutamatergic neurons and in the radial glia-like stem cells of the SGZ neurogenic niche in the adult brain. 
(A) RNA ISH on P21 mouse brain reveals a strong signal of E13 in the cerebral cortex, in the hippocampus and in the amygdaloid nuclei 
(arrowheads). Scale bar: 500 µm. (B) A more detailed analysis of E13 expression in primary somatosensory cortex indicates a widespread signal 
of E13 in all the cortical layers. Co-localization with the vesicular glutamate transporter 1, Vglut1, is detected in E13+ cortical neurons (high 
magnification and arrow in insets). No co-localization of E13 with Blbp and GFAP is found (high magnification and arrow in insets). Scale bar: 
100 µm. (C) In the hippocampus, E13 expression is present in both the main glutamatergic neuron types: CA pyramidal cells and the DG granule cells. 
Interestingly, in the DG, E13 shows a gradient similar to that found in the developing cortex, namely higher in the more mature granule cells. Scale 
bar: 50 µm. (D) By adjacent ISH for E13 and GAD67 in P21 brains, at the hippocampal level [dotted box in (A)], E13 also appears in the 
GABAergic interneurons populating the oriens and radiatum strata of the hippocampus (arrows). so, stratum oriens; Pyr, pyramidal layer; sr, 
stratum radiatum; Rs, retrosplenial cortex; Ss1, primary somatosensory cortex; Ss2, secondary somatosensory cortex; Pir, piriform cortex; BA, 
basal amygdaloid nuclei; Hp, hippocampal formation; CA1, Cornu Ammonis field 1 of hippocampus proper; Iml, inner molecular layer; GCsL, 
granule cells layer; sgz, subgranular zone; po, polymorphic layer. 
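
ARACNe, the algorithm applied in this work, scores each pair of genes by their mutual information (MI) across the expression profiles. The following minimal sketch illustrates what a pairwise MI measures, using a simple equal-width histogram estimator. This estimator, the number of bins, the function name and the toy profiles are assumptions made for the example; the published ARACNe method uses its own MI estimator and an additional pruning step, neither of which is reproduced here.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=4):
    """Histogram estimate of the mutual information (in bits) of two profiles."""

    def discretise(values):
        lo, hi = min(values), max(values)
        if hi == lo:
            return [0] * len(values)  # a constant profile falls in a single bin
        width = (hi - lo) / bins
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretise(x), discretise(y)
    n = len(x)
    count_x, count_y = Counter(bx), Counter(by)
    count_xy = Counter(zip(bx, by))

    mi = 0.0
    for (i, j), c in count_xy.items():
        # p(i, j) * log2( p(i, j) / (p(i) * p(j)) )
        mi += (c / n) * math.log2(c * n / (count_x[i] * count_y[j]))
    return mi


# Toy profiles for illustration only.
gene_a = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
gene_b = [2 * v for v in gene_a]   # fully determined by gene_a
gene_c = [5.0] * 8                 # constant, carries no information
mi_ab = mutual_information(gene_a, gene_b)
mi_ac = mutual_information(gene_a, gene_c)
```

In this toy case the dependent pair reaches the maximum of 2 bits allowed by four equally populated bins, whereas the pair involving the constant profile scores zero.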

Interestingly, E13 was the uncharacterized ES-specific 
'hub' gene with the highest number of connections in the 
inferred network. Hub genes have been found to be master 
regulators of specific transcriptional programs both in 
normal and pathogenic conditions (74,75). Here, by 
transcriptomics, immuno-based affinity purification 
experiments and ISH, we indeed identified a role of E13 
in neuronal subpopulation specification. 

The capability of neurons to adopt the correct 
neurotransmitter phenotype during early development is 
critical for the proper functioning of the 
adult vertebrate nervous system. A diverse array of 
neuronal types has to arise from a field of 
undifferentiated progenitors; how a cell acquires a given 
neurotransmitter phenotype is a central issue in developmental 
neurobiology (76,77). This is particularly true for 
glutamatergic and γ-aminobutyric acid (GABA)ergic 
neurons, which are the most abundant excitatory and 
inhibitory neurons, respectively, in the vertebrate central 
nervous system. It is likely that only a proportion of the 
factors required for neuronal identity have so far been 
identified, and the precise way in which such factors 
interact to specify the timing and terminal differentiation 
of particular neuronal subpopulations is not yet 
defined (78). 

Figure 8. Proposed mechanism of action of E13 in regulating neuronal 
cell differentiation. E13 may interact with proteins of PRCs to regulate 
neuronal specification by modulating the transcriptional program of 
differentiating cells. 



In addition, the emerging role that epigenetic control 
plays in brain development implies that the interplay of 
TFs and epigenetic modifiers, including histone 
modifications, DNA methylation and microRNAs, is essential for 
the acquisition of specific cell fates (79). Several 
chromatin-modifying complexes regulate the renewal or 
differentiation of a range of neural stem cells, but the 
Polycomb repressor complexes (PRCs) are of particular 
interest in this context. This is particularly true for some 
members of PRC2, such as Ezh2, whose altered 
expression is able to change the competence of cortical 
progenitors to generate neurons of different cortical layers, 
orchestrating the switch from neurogenesis to gliogenesis 
(80,81). Members of this Polycomb complex are also 
required for subgranular zone (SGZ) hippocampal 
adult neurogenesis (82,83) and impact excitatory 
synaptic plasticity (84,85). 

Here, we show that E13 may be a component of the 
genetic and epigenetic networks controlling neuronal 
specification. Indeed, induction of the expression of E13 in 
stable ES cell lines results in the significant up-regulation 
of markers of glutamatergic excitatory neurons, and a 
strong down-regulation of the GABAergic neuron and 
radial glia markers. Our study further suggests that the 
effects of E13 on neuronal differentiation may be exerted 
via epigenetic mechanisms; the results we obtained by 
immuno-based affinity purification show that E13 
interacts with Eed (EED gene) and Suz12 (suppressor of 
zeste 12 homologue), which are both components of 
the PRC-EED-EZH2 Polycomb chromatin modelling 
complex (48), as well as the retinoblastoma binding 
protein Rbbp4, a core histone-binding protein (86). 

Taken together, the interaction of E13 with these 
proteins suggests that it is a member of the epigenetic 
machinery regulating neuronal subtype and glia commitment 
(Figure 8). Further studies, including mouse models and 
chromatin IP, are needed to clarify the specific 
mechanisms by which E13 exerts its function on neuron 
specification. 